Abstract

Several  previously  published  papers  have  cited  the  need  to  include  correlation  in  risk- 
analysis  models.  In  particular,  a  landmark  paper  published  by  Philip  Lurie  and  Matthew 
Goldberg  presented  a  methodology  for  inducing  Pearson’s  correlation  between 
input/independent random variables. The one subject absent from that paper was a
methodology for finding the optimal applied correlation matrix given a desired outcome
correlation. Since the publication of the Lurie-Goldberg paper, there has been continuing
discussion of its implementation; however, no one has presented an
optimization algorithm that avoids computing-heavy simulations. This paper
reviews  the  general  methodology  used  by  Lurie  and  Goldberg  (along  with  its  predecessor 
papers)  and  presents  a  non-simulation  approach  to  finding  the  optimal  input  correlation  matrix, 
given  a  set  of  marginal  distributions  and  a  desired  correlation  matrix. 

Introduction 

The Complete Correlation Algorithm (CCA) developed by Northrop Grumman and
recently implemented in NG-developed risk models is a product of more than two years of
research  and  development.  Several  previously  published  papers  have  cited  the  need  to  include 
correlation  in  risk-analysis  models;  however,  none  present  an  optimization  algorithm  that  does 
not  involve  the  use  of  computing-heavy  simulations.  In  particular,  a  landmark  paper  published 
by  Philip  Lurie  and  Matthew  Goldberg  (1998)  presented  a  methodology  for  inducing  Pearson’s 
correlation  between  input  random  variables.  This  paper  reviews  the  general  methodology  used 
by  Lurie  and  Goldberg  (along  with  its  predecessor  papers)  and  presents  the  Druker  Algorithm:  a 
non-simulation  approach  to  finding  the  optimal  input  correlation  matrix  given  a  set  of  marginal 
distributions  and  a  desired  correlation  matrix. 

The  CCA  was  deliberately  created  bearing  in  mind  identified  environmental  factors  that 
prevent  easy  implementation  of  commercially  available  models.  No  one  on  the  team  had  any 
experience  implementing  correlation  into  Monte  Carlo  simulations  beyond  the  use  of  COTS 
programs, such as @Risk™ and Crystal Ball™. To determine the best development method,
the  following  factors  were  considered: 

1. The Northrop Grumman risk models need to be of an easily transferable electronic size,
as  the  models  are  often  shared  via  email  or  network  drives. 

2.  A  diverse  group  of  users  must  be  able  to  run  the  software  in  a  variety  of  work 
environments;  Microsoft  Office  is  the  only  platform  that  is  transferable  to  all  parties. 
Users  include  risk  practitioners,  program  managers  and  members  of  pricing 
organizations;  locations  include  unclassified  and  classified  Northrop  Grumman  facilities, 
unclassified  and  classified  customer  facilities  and  home  offices. 

3. Custom implementations are frequent; much of NGIT-TASC risk work requires risk
simulations  to  be  built  into  pre-existing  cost  and  price  models.  These  models  are 
generally  limited  to  Microsoft  Excel  and  Access;  however,  Web-based  platforms  are  not 
unheard  of. 

The  above  concerns  drove  the  decision  to  use  Visual  Basic  source  code  to  develop  the  CCA. 

Initially,  the  development  was  focused  on  an  algorithm  that  could  induce  Pearson’s 
correlation between typical distributions in risk analysis: Bernoulli (discrete), Triangular, Normal
and  Log-Normal.  By  limiting  the  problem  to  the  most-common  applications,  in  theory,  the 
solution  should  have  been  easier  to  find.  While  attempting  to  ascertain  the  maximum  correlation 
between  any  two  Bernoulli  distributions,  however,  the  general  solution  was  uncovered.  The 
resulting  algorithm  induces  Pearson’s  correlation  between  any  set  of  random  variables  (while 
still  preserving  the  marginal  distributions)  using  the  Lurie-Goldberg  Method  and  without  the  use 
of  simulation  to  find  the  optimal  applied  correlation  matrix. 

The  CCA  is  a  compilation  of  multiple  algorithms  (each  named  for  their  main  author(s)) 
from  several  sources:  existing  papers,  public  source  code  and  internally-developed  code.  Most 
of  the  algorithms  used  were  taken  from  a  variety  of  existing  papers.  Although  these  papers  all 
provided  complete  algorithms,  they  sometimes  lacked  details  in  how  to  accomplish  key  steps;  in 
cases  such  as  these,  gaps  were  filled  with  open-source  code  solutions.  The  optimization  of  the 
applied  correlation  matrix,  the  last  step  in  the  correlation  algorithm,  was  developed  entirely  by 
the  Northrop  Grumman  Team. 

Definitions  and  Assumptions 

Matrix  Definitions: 

1. Consistent Correlation Matrix — A Consistent Correlation Matrix has diagonal entries
equal to 1.0 and all other entries in [-1, 1], and it is symmetric and positive definite.
Consistency  is  necessary  for  a  viable  correlation  matrix,  but  a  Consistent  Correlation 
Matrix  may  not  necessarily  be  viable  given  the  Parent  Distributions. 

2.  Input  Correlation  Matrix  (I) — The  user-inputted  correlation  matrix.  This  matrix  may  or 
may  not  be  a  consistent  correlation  matrix. 

3.  Adjusted  Correlation  Matrix  (L) — The  Input  Correlation  Matrix  adjusted  to  be  a 
Consistent  Correlation  Matrix.  This  matrix  will,  by  definition,  be  positive  definite. 
Additionally,  the  adjusted  matrix  will  be  viable  as  correlations  between  various 
distributions  of  random  variables  will  be  achievable.  When  (L)  is  generated,  the 
differences  between  (I)  and  (L)  are  minimized. 

4.  Applied  Correlation  Matrix  (A) — The  correlation  matrix  used  by  the  grand  algorithm  to 
generate  correlated  random  number  draws.  This  matrix  may  be  the  same,  or  very 
different  from,  the  Adjusted  Correlation  Matrix;  the  extent  of  the  differences  will 
depend  on  the  random  variables  to  be  correlated. 

5.  Optimal  Applied  Correlation  Matrix  (A’) — The  Applied  Correlation  Matrix  optimized 
using  the  Lurie-Goldberg  Method. 

6.  Outcome  Correlation  Matrix  (O) — The  correlation  matrix  of  the  simulated  variables 
following the simulation run. The goal of the grand correlation algorithm is for (O) to be
identical  to  (L). 

Other Definitions:

1. Parent Distribution — The distributions correlated for use in the simulation. The
distributions  are  simulated  using  the  Inverse  CDF  technique.  The  goal  is  to  induce  a 
desired  correlation  between  these  distributions. 

2. Pearson’s Correlation — A parametric statistic that measures the strength and direction
of a linear relationship between two random variables (“Correlation,” 2008).

3.  Spearman’s  Rank  Correlation — A  non-parametric  statistic  that  measures  the 
monotonicity  of  a  function  without  making  any  assumptions  as  to  the  distribution  of  the 
variables. 

4. Eigenvalues — A scalar (λ) associated with a matrix such that if (A) is a matrix and (X) is
a vector, AX = λX. The vector (X) is known as the Eigenvector that corresponds to the
Eigenvalue (λ).

Assumptions: 

1. Normal Distributions — Any reference to the normal distribution, whether in a univariate
or  bivariate  case,  is  assumed  to  be  the  Standard  Normal  distribution  (Mean  of  0, 
Standard  Deviation  of  1). 

Pearson’s  vs.  Rank  Correlation 

Most  COTS  risk  tools  use  Spearman’s  rank  correlation  as  a  substitute  for  Pearson’s 
correlation  between  parent  distributions.  Spearman’s  rank  correlation  (a  non-parametric 
statistic)  differs  from  Pearson’s  correlation  (a  parametric  statistic)  in  that  it  measures  the 
monotonicity of a function, whereas Pearson’s correlation measures the strength of the linear
relationship  between  two  functions  (see  Figure  1).  Though  studies  have  shown  that,  using  the 
most  common  risk  distributions,  models  using  rank  correlation  yield  similar  results  to  those 
using  Pearson’s  (Robinson  &  Sails,  2004),  there  is  a  distinct  difference  between  the  two. 
Although  this  paper  will  not  detail  all  the  differences  between  the  two  measures,  a  quick  (and 
exaggerated) example is presented below. The grand algorithm removes the need to
substitute Spearman’s rank correlation for Pearson’s correlation.
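
The gap between the two measures can be reproduced with a few lines of code. The following is a minimal sketch, not the authors' implementation: the data are hypothetical (cost growing exponentially with weight), and the helper names `pearson`, `ranks` and `spearman` are introduced here for illustration only. Because the relationship is perfectly monotonic but not linear, the rank correlation is 1 while Pearson's correlation is noticeably lower.

```python
def pearson(xs, ys):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the data below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: cost doubles with each unit of weight, so the
# relationship is perfectly monotonic but far from linear.
weight = list(range(1, 11))
cost = [2.0 ** w for w in weight]

print("Pearson: ", round(pearson(weight, cost), 3))
print("Spearman:", round(spearman(weight, cost), 3))
```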

[Figure 1. Pearson's vs. Spearman's Rank Correlation. Scatter plot of Cost vs. Weight with
linear trend line y = 69.857x - 497.36 (R² = 0.6554); Pearson's Rho = 0.81, Spearman's Rho = 1.00.]

Algorithm  Overview 

There  are  three  main  steps  behind  the  grand  algorithm.  An  outline  of  these  steps 
follows,  and  the  upcoming  sections  of  this  paper  will  review  each  individual  step  in  detail. 

1. Correct the User-Input Correlation Matrix (I)

a.  Correct  I  so  that  it  is  consistent — both  in  terms  of  a  general  correlation  matrix  and 
the  properties  of  the  parent  distributions  being  correlated. 

b.  Through  these  corrections,  the  Adjusted  Correlation  Matrix  (L)  will  be  generated. 

2.  Optimize  the  Applied  Correlation  Matrix 

a.  Find  the  Optimal  Applied  Correlation  Matrix  (A’)  such  that  when  A’  is  run 
through  the  Lurie-Goldberg  Method,  the  Outcome  Correlation  Matrix  (O)  is 
identical  to  L. 

3.  Correlate  the  Input  Random  Variables 

a.  Using  A’,  apply  the  Lurie-Goldberg  Method  to  correlate  the  parent  distributions. 

For  purposes  of  presenting  the  methodology,  it  is  necessary  to  show  how  the  input 
random  variables  are  to  be  correlated  before  discussing  how  to  find  A’. 

Correcting  the  User-Input  Correlation  Matrix  (Part  I) 

Giving  users  the  ability  to  input  their  own  correlation  matrix  allows  for  the  possibility  that 
the  User-Input  Correlation  Matrix  (I)  may  not  be  a  viable  correlation  matrix.  Correlation 
matrices, by definition, have diagonal entries of 1.0 and all other entries in [-1, 1], and they are
symmetric and positive definite. The first step in inducing correlation between input random
variables is checking whether I is a consistent correlation matrix. If it is not, it must be corrected
so that it is.

The Iman-Davenport Algorithm, which is based on a paper by Ronald Iman and James
Davenport (1982), is used to correct I in order to make it a consistent correlation matrix. While
numerous other papers have been published describing methods to correct I such that it is
altered as little as possible (Higham, 2002), the Iman-Davenport Algorithm is the most
computationally efficient method the authors uncovered. Given that additional adjustment may
be required based on the parent distributions being correlated, the resulting matrix is close
enough to I to satisfy this requirement.

The  algorithm  corrects  I  in  three  main  phases.  First,  the  algorithm  checks  whether  I  is 
symmetric  with  diagonal  entries  of  1.0  and  off-diagonal  entries  between  [-1,  1].  If  it  is  not,  the 
user  is  prompted  to  re-input  the  matrix,  correcting  for  the  discrepancies. 
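
The first phase amounts to a handful of element-wise checks. The sketch below is one way to express them, assuming the matrix is held as a list of rows; the function name `basic_check` and the tolerance `tol` are illustrative choices, not part of the CCA itself.

```python
def basic_check(matrix, tol=1e-9):
    """Return a list of phase-one problems; an empty list means the matrix passes."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return ["matrix is not square"]
    problems = []
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            problems.append(f"diagonal entry ({i},{i}) is not 1.0")
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                problems.append(f"entries ({i},{j}) and ({j},{i}) differ")
            if not -1.0 <= matrix[i][j] <= 1.0:
                problems.append(f"entry ({i},{j}) is outside [-1, 1]")
    return problems


# A symmetric matrix passes; an asymmetric one is reported back to the user.
print(basic_check([[1.0, 0.8], [0.8, 1.0]]))
print(basic_check([[1.0, 0.8], [0.7, 1.0]]))
```

A non-empty result corresponds to the case in which the user is asked to re-enter the matrix.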

Second,  once  the  above  conditions  are  satisfied,  the  algorithm  checks  whether  I  is 
positive-definite.  One  way  to  test  this  is  to  find  the  eigenvalues  for  I  (positive-definite  matrices 
have  all  positive  eigenvalues).  The  paper  referenced  did  not  describe  an  approach  for  finding 
the  eigenvalues  of  the  matrix.  After  further  research,  the  Jacobi  Eigenvalue  Algorithm  was 
determined  to  be  a  sufficiently  efficient  way  to  evaluate  a  matrix’s  eigenvalues.  As  a  result,  the 
eigenvalues  are  produced  as  the  diagonals  of  an  otherwise  zero-matrix.  The  Jacobi  Eigenvalue 
Algorithm  is  computationally  inexpensive  and  pre-existing  source  code  was  used  in  its 
implementation. 

If all eigenvalues for I are positive and the other conditions have been satisfied, then I is
a consistent correlation matrix. Otherwise, in the third phase, negative eigenvalues are changed
to small, positive values (e.g., 0.000001). The matrix of eigenvectors (also produced using the
Jacobi Eigenvalue Algorithm) is then multiplied by the diagonal matrix of adjusted eigenvalues.
That product is, in turn, multiplied by the inverse of the matrix of eigenvectors to
arrive at a new matrix that is similar, but not equal to, I. Lastly, the diagonals are reset to 1.0, as
they may have changed during the transformation. This third phase of the algorithm is repeated
until all eigenvalues of the adjusted matrix are positive. At this point, the user input matrix has
been adjusted such that it is a consistent correlation matrix.
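
The second and third phases can be sketched as follows. This is an illustrative reading of the steps described above, not the Northrop Grumman code or the original Iman-Davenport program: `jacobi_eigen` is a plain cyclic Jacobi iteration written for this example, `make_consistent` and its parameters `floor` and `max_iter` are hypothetical names, and the sketch replaces every non-positive eigenvalue (not only the negative ones) so that the result is strictly positive definite. Because the eigenvector matrix of a symmetric matrix is orthogonal, its inverse is taken to be its transpose.

```python
import math


def jacobi_eigen(a, sweeps=50, tol=1e-22):
    """Cyclic Jacobi method for a symmetric matrix (list of rows).

    Returns (eigenvalues, q); column k of q is the eigenvector for value k.
    """
    n = len(a)
    a = [row[:] for row in a]
    q = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for r in range(p + 1, n):
                if a[p][r] == 0.0:
                    continue
                # Rotation angle chosen so that the (p, r) entry becomes zero.
                diff = a[r][r] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][r] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and r of a and of q
                    a[k][p], a[k][r] = c * a[k][p] - s * a[k][r], s * a[k][p] + c * a[k][r]
                    q[k][p], q[k][r] = c * q[k][p] - s * q[k][r], s * q[k][p] + c * q[k][r]
                for k in range(n):  # rotate rows p and r of a
                    a[p][k], a[r][k] = c * a[p][k] - s * a[r][k], s * a[p][k] + c * a[r][k]
    return [a[i][i] for i in range(n)], q


def make_consistent(corr, floor=1e-6, max_iter=100):
    """Repeat the eigenvalue adjustment until the matrix is positive definite."""
    n = len(corr)
    m = [row[:] for row in corr]
    for _ in range(max_iter):
        values, q = jacobi_eigen(m)
        if min(values) > 0.0:
            return m
        values = [v if v > 0.0 else floor for v in values]
        new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            new[i][i] = 1.0  # diagonals are reset to 1.0
            for j in range(i + 1, n):
                # Entry (i, j) of Q * diag(values) * Q^T, mirrored for symmetry.
                new[i][j] = new[j][i] = sum(q[i][k] * values[k] * q[j][k] for k in range(n))
        m = new
    raise RuntimeError("no positive-definite matrix found within max_iter")


# Illustrative 3x3 input that is symmetric with unit diagonal but not positive definite.
user_input = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.8], [0.1, 0.8, 1.0]]
fixed = make_consistent(user_input)
for row in fixed:
    print([round(v, 4) for v in row])
```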

Though  the  User-Input  Correlation  Matrix  is  now  a  consistent  correlation  matrix,  the 
transformation  of  I  is  not  complete  and  the  Adjusted  Correlation  Matrix  (L)  has  not  been 
determined.  As  will  be  shown  later,  depending  on  the  parent  distributions  being  correlated, 
there  may  be  a  maximum  achievable  correlation  between  any  two  of  the  variables. 

Determination  of  L  will  be  covered  later  in  the  section:  Correcting  the  User-Input  Correlation 
Matrix  (Part  II). 

Correlating  Input  Random  Variables 

In  order  to  understand  how  the  Applied  Correlation  Matrix  (A)  is  to  be  optimized  such 
that the Outcome Correlation Matrix (O) is identical to the Adjusted Correlation Matrix (L), the
method for correlating the parent distributions must first be discussed. It is a well-known fact
that normal random variables can be correlated by multiplying a vector of uncorrelated normal
random draws by the Cholesky decomposition of the desired correlation matrix. The
Lurie-Goldberg Method takes this one step further, using normal random variates to generate
correlated  uniform  random  variates.  These  uniform  random  variates  are  then  transformed  via 
the  inverse-CDF  technique  to  generate  draws  from  the  desired  parent  distributions.  In  this 
method,  although  the  correlations  between  the  normal  random  draws  are  known,  as  these 
draws  are  transformed  into  other  distributions,  the  correlations  change.  Hence,  the  core  problem 
emerges:  how  can  the  Optimal  Applied  Correlation  Matrix  (A’)  be  uncovered  such  that  O 
matches  L?  Answering  this  question  is  key  to  implementing  the  Lurie-Goldberg  Method.  The 
authors  have  developed  an  algorithm  that  addresses  this  very  question,  without  necessitating 
any  runs  of  the  simulation.  Additionally,  they  have  begun  the  process  of  optimizing  this 
algorithm,  finding  heuristics  that  allow  it  to  run  with  a  minimal  number  of  calculations. 
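
The sampling chain described above (Cholesky, normal CDF, inverse CDF) can be sketched in a few functions. This is a minimal illustration under stated assumptions rather than the CCA source: the two parent distributions (a triangular cost and a Bernoulli risk), their parameters, the applied correlation of 0.8 and all function names are hypothetical. The simulation at the end is only there to show that the outcome correlation drifts away from the applied one; it is not the non-simulation optimization the paper presents.

```python
import math
import random
from statistics import NormalDist


def cholesky(m):
    """Lower-triangular factor `low` with low * low^T = m (m positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def triangular_inv_cdf(lo, mode, hi):
    """Inverse CDF of a triangular distribution, returned as a function of u."""
    cut = (mode - lo) / (hi - lo)

    def inv(u):
        if u < cut:
            return lo + math.sqrt(u * (hi - lo) * (mode - lo))
        return hi - math.sqrt((1.0 - u) * (hi - lo) * (hi - mode))

    return inv


def bernoulli_inv_cdf(prob, cost):
    """Risk that costs `cost` with probability `prob`, otherwise 0."""
    return lambda u: cost if u > 1.0 - prob else 0.0


def correlated_draws(applied, inverse_cdfs, n_draws, seed=0):
    """Correlated normals -> correlated uniforms -> parent distributions."""
    rng = random.Random(seed)
    low = cholesky(applied)
    phi = NormalDist().cdf
    n = len(applied)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        zc = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        draws.append([inv(phi(v)) for inv, v in zip(inverse_cdfs, zc)])
    return draws


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


applied = [[1.0, 0.8], [0.8, 1.0]]
parents = [triangular_inv_cdf(100.0, 120.0, 200.0), bernoulli_inv_cdf(0.3, 50.0)]
draws = correlated_draws(applied, parents, 20000)
outcome = pearson([d[0] for d in draws], [d[1] for d in draws])
print("applied correlation 0.8, outcome correlation", round(outcome, 3))
```

The marginal distributions are preserved by construction, while the outcome correlation comes out below the applied value, which is the motivation for searching for A’.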

Implementation  and  Application  of  the  CCA 

The  CCA’s  chief  advantage  is  that  it  is  non-recurring  and  its  implementation  requires  no 
simulation.  Furthermore,  because  the  algorithm  only  requires  looking  at  pairs  of  parent 
distributions,  once  the  applied  matrix  has  been  found  for  a  set  of  parent  distributions,  the 
algorithm  must  only  be  run  when  distributions  are  added  or  changed,  and  even  then,  only  for  the 
new/altered  distributions.  The  algorithm  also  uses  Pearson’s  correlation  while  COTS  risk  tools 
substitute  Spearman’s  rank  correlation. 

The  applications  of  the  CCA  reach  beyond  the  Cost  and  Risk  analysis  community;  this 
algorithm  is  useful  anywhere  there  is  a  need  to  induce  Pearson’s  correlation  between  input 
variables. For example, this algorithm can be applied to autocorrelation, to stock market projections
in the financial arena and to traditional modeling and simulation situations when correlation is
needed. The algorithm was designed with a focus on portability. Because the algorithm is coded
in Visual Basic, it can be easily integrated into existing tools and models.

Introduction

• Before moving to the main topic of the paper it is important to quickly
discuss the motivation behind its development

• Studies have shown [1, 2] that 75-85% of DoD programs experience cost
overruns

- This suggests that as an industry, our estimates are not at the 50th percentile, but rather at
about the 20th percentile

• Recognizing this, agencies are taking the initiative to budget at higher
percentiles of cost

- NASA requires all programs be funded at the 70th percentile

• Constellation at the 65th

- The Air Force (Dr. Sega) has released a memo advising that all space programs be funded at
the 80th percentile

• Rich Hartley (AFCAA) has advised against this, recommending programs be funded at
the mean of the AFCAA ICE Estimate (generally between about the and 60th
percentiles)

• In order to determine the appropriate funding level for programs
anywhere but at the mean, it is thus imperative that the risk and
uncertainty around estimates be assessed

- Thus S-Curves must be developed

[1] Schaffer 2004 study, referenced from Cost Estimating Requirements to Support New Congressional Reporting Requirements. Coonce et al.
NASA PM Challenge, February 2008

[2] NAVAIR Cost Growth Study, R.L. Coleman, M.E. Dameron, C.L. Pullen, J.R. Summerville, D.M. Snead, 34th DoDCAS and ISPA/SCEA
2001

Sample Program S-Curve

[Figure: Program "X" Cumulative Distribution. Cumulative probability (0 to 1) plotted against
Total Cost ($K), $200,000 to $500,000. Legend: Cumulative Distribution, Proposal Value, Likely Cost,
Coefficient of Variation. Callouts: Program Budget, $363,830, 80.0% ("There is an 80% probability
that this program will be executed at or under budget"); Likely Cost, $334,656, 50.0%;
Coefficient of Variation, 10.70% ("We strongly recommend that the CV be called out explicitly -
guarding against x-axis distortions"); "NASA uses a similar methodology and requires all programs
to be funded at the 70th percentile (Constellation programs at the 65th)".]


S-Curves

• S-Curves are the cumulative distribution function for the cost of a system

- Also known as probabilistic cost estimates

• S-Curves are generally driven by two main factors

- Cost Estimating Variance

• Labor estimates

- Data Driven

- SME Driven

• Escalation/Inflation Rates

• Material Costs

• Productivity (e.g. hrs/SLOC, hrs/ft^)

- Schedule/Technical Risks and Opportunities

• Discrete Events

• Continuous Events

• Two key measures are derived from these S-Curves

- Confidence level of the estimate

• What is the probability that the program will finish at or under budget?

- Uncertainty in the estimate

• What is the range of possibilities for the final cost of this program?
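
Both measures can be read directly off a set of simulated total costs. The sketch below is illustrative only: the simulated program (five triangular cost elements with made-up parameters) and the function names are assumptions introduced for this example, not the CG(X) model.

```python
import math
import random
from statistics import mean, pstdev


def confidence_level(costs, budget):
    """Fraction of simulated outcomes at or under the budget (height of the S-Curve)."""
    return sum(c <= budget for c in costs) / len(costs)


def cost_at_percentile(costs, q):
    """Smallest simulated cost whose cumulative probability is at least q."""
    ordered = sorted(costs)
    index = max(0, math.ceil(round(q * len(ordered), 9)) - 1)
    return ordered[index]


def coefficient_of_variation(costs):
    """Standard deviation of the simulated costs divided by their mean."""
    return pstdev(costs) / mean(costs)


# Hypothetical program: the total cost is the sum of five triangular cost elements.
rng = random.Random(42)
costs = [sum(rng.triangular(50.0, 120.0, 70.0) for _ in range(5)) for _ in range(5000)]

budget_80 = cost_at_percentile(costs, 0.80)
print("80th percentile budget:", round(budget_80, 1))
print("confidence at that budget:", confidence_level(costs, budget_80))
print("CV:", round(coefficient_of_variation(costs), 4))
```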

Statement of Problem/Motivation

• Due to the increased focus on the reasonableness of cost estimates across
the DoD community, a thorough risk assessment was conducted on the
CG(X) program estimate

- In particular, the Northrop Grumman team wanted to explore reasons that cost
growth may be underestimated

- It was determined that the treatment of correlation in risk adjusted cost estimates
was one of the leading causes of this

- Correlation directly affects the CV of the S-Curve

• In order to correctly capture program risk at a lower level, NGIT needed a
way to include relational/injected correlation in our risk models

- Without this ability the top level CV would be artificially shrunk due to the "square
root of n problem"

• The following conditions led the team away from traditional COTS models

- The risk analysis module was to be incorporated into the CG(X) cost model

- Both the cost and risk models were to be transitioned to a web-based platform

• In early 2006, work was begun on what would become the "Cost/Risk
Correlation Module"

- The module would have to exist entirely inside of Excel and VBA so it could be shared
with any user with Office 2003 or later

- The module would have to be open enough that it could be dropped quickly into most
home-grown Monte Carlo models




Outline 


Introduction to Correlation

- Pearson's "Rho"

- Pearson's vs. Rank Correlation

The Problem

Correlation Matrix Definitions
Correlation in Risk Models

Cost/Risk Correlation Algorithm

- Correcting the user-input matrix

- Correlating the random variables

- Optimizing the Applied Matrix

Correlation (Pearson's)

• Although this paper is not about correlation itself, it's
important to briefly review the two most common
measures

- Pearson's Product-Moment Correlation

- Spearman's Rank Correlation

• When correlation is discussed in terms of cost
estimating, Pearson's correlation is generally described

• Pearson's Correlation is a measure of the linear
relationship between two variables

- This is as opposed to Rank Correlation, which will be
discussed on the next slide

• It is identified using the Greek symbol ρ and is always
between [-1,1]

• The correlation of a linear regression is the square
root of R² (carrying the sign of the slope)

The examples on the right show representative data
sets for three values of ρ


Pearson's  Correlation  vs.  Rank  Correlation 


NORTHROP GRUMMAN


• Most commercial risk programs (e.g. Crystal Ball &
@Risk) use Spearman's rank correlation rather than
Pearson's correlation because it is easier to simulate

• Spearman's rank correlation is used to detect
correlation between two variables, without assuming
a linear relationship

- It is concerned with whether or not the function is
monotonic

• Some other differences include

- Pearson's is parametric, Spearman's is not
- Spearman's is not to be used for predictive purposes

• In the example to the right, rank correlation and
Pearson's correlation yield very different answers


[Figure: Cost vs. Weight scatter plot with linear trend line y = 69.857x - 497.36 (R² = 0.6554);
Pearson's Rho = 0.81, Spearman's Rho = 1.00.]

• Although it is important to distinguish between these
two types of correlation, past research has shown
that in cost risk simulations using the most common
families of distributions, the two yield fairly similar
results [1]

- The aim of the authors is to "commit no avoidable errors"


[1] Robinson, M. and Sails, W. More on Correlation Accuracy in Crystal Ball Simulations (or What We've Now Learned about
Spearman's R in Cost Risk Analyses). Presented at the 2004 SCEA Conference, Manhattan Beach, CA, June 2004

Correlation in Risk Models

• In risk analysis, correlations are critical to successful simulations used to find
distributions  of  cost 

-  Correlations  are  thought  to  be  widely  present  among  elements  of  cost,  but  little  data 
exists  to  determine  them,  principally  because  to  determine  correlations  among  any  set 
of  variables,  data  points  must  contain  those  variables  in  common,  and  this  is  rarely  the 
case 

-  Without  accounting  for  correlation,  summing  multiple  independent  risk  distributions  will 
lead  to  an  artificial  degradation  in  the  CV 

•  This  is  known  as  the  "Square  Root  of  N"  problem 

•  Lacking  discernable  correlations,  risk  analysts  are  forced  to  rely  on  Subject 
Matter  Experts  to  estimate  correlations 

-  These  correlations  are  subtle  and  difficult  to  estimate 

-  Estimated  correlations,  to  be  usable,  must  be  "coherent",  as  discussed  later 

•  Once  the  desired  correlation  between  all  cost  elements  is  determined,  the 
next  problem  is  to  build  these  correlations  into  the  risk  model 

•  The  following  slides  will  lay  out  the  algorithms  used  in  the  correlation  module 
and  demonstrate  how  they  were  applied  to  the  CG(X)  program 
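
The "Square Root of N" effect mentioned on this slide can be made concrete with a short calculation. The sketch assumes n identical cost elements, each with the same mean and coefficient of variation, and a single common pairwise correlation rho; these simplifying assumptions and the function name `cv_of_total` are introduced here for illustration. Under them, the variance of the total is n·σ² + n(n-1)·ρ·σ², so the CV of the total is the element CV times sqrt((1 + (n-1)ρ)/n).

```python
import math


def cv_of_total(element_cv, n, rho):
    """CV of the sum of n identical cost elements with common pairwise correlation rho."""
    return element_cv * math.sqrt((1.0 + (n - 1) * rho) / n)


# With no correlation, the CV of the total shrinks by a factor of sqrt(n).
for rho in (0.0, 0.2, 0.5, 1.0):
    print(f"rho = {rho:.1f}: CV of total = {cv_of_total(0.30, 100, rho):.4f}")
```

With 100 independent elements, a 30% element-level CV collapses to 3% at the top level; any positive correlation restores part of that spread.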

Definitions: Matrices

• Before proceeding, it is important to define several
matrices that will be used in the algorithm

• Input Correlation Matrix:

- The correlation matrix inputted by the user; it may or may not be a
consistent correlation matrix

• Adjusted Correlation Matrix:

- The consistent correlation matrix found by the model that is as
close as possible to the Input Correlation Matrix

• This matrix is positive semidefinite

• It is also coherent given the distributions being correlated

• Applied Correlation Matrix:

- The correlation matrix utilized by the algorithm to generate
correlated random number draws

• Outcome Correlation Matrix

- The correlation matrix of the simulation variables after the
simulation is run

• Ideally it is identical to the Adjusted Correlation Matrix


User-Input Matrix

1.0000  0.8000  0.1000
0.8000  1.0000  0.8000
0.1000  0.8000  1.0000

Adjusted Matrix

1.0000  0.7522  0.1322
0.7522  1.0000  0.7522
0.1322  0.7522  1.0000

Applied Matrix

1.0000  0.7915  0.2263
0.7915  1.0000  0.7744
0.2263  0.7744  1.0000

Outcome

1.0000  0.7522  0.1316
0.7522  1.0000  0.7521
0.1316  0.7521  1.0000

Definitions: Eigenvalues/Eigenvectors

• An eigenvector is a vector v such that for a square matrix A and a scalar
λ, Av = λv

• It follows that if Q is an indexed set of linearly independent
eigenvectors for matrix A and Λ is the diagonal matrix containing the
corresponding eigenvalues of A as its diagonal entries then:

A = QΛQ⁻¹

• By altering Λ, the diagonal matrix consisting of A's eigenvalues, we
eventually arrive at a positive definite correlation matrix that is close to
the user input matrix

• The Jacobi Eigenvalue algorithm is used to find both the eigenvalues
and eigenvectors of the user input correlation matrix
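
As a worked illustration of these definitions, the eigenvalues of the 3x3 User-Input Matrix from the preceding slide (off-diagonal entries 0.8, 0.1 and 0.8) can be found by hand. The derivation below is added for illustration; it uses the symmetry of that particular matrix and shows that one eigenvalue is negative, which is why the matrix needs correcting.

```latex
M = \begin{pmatrix} 1 & 0.8 & 0.1 \\ 0.8 & 1 & 0.8 \\ 0.1 & 0.8 & 1 \end{pmatrix}

% Antisymmetric eigenvector v = (1, 0, -1)^T:
M v = (0.9,\ 0,\ -0.9)^T = 0.9\,v \quad\Rightarrow\quad \lambda_1 = 0.9

% Symmetric eigenvectors v = (x, y, x)^T give
% 1.1x + 0.8y = \lambda x  and  1.6x + y = \lambda y, so
(\lambda - 1.1)(\lambda - 1) = 1.28 \quad\Rightarrow\quad \lambda^2 - 2.1\lambda - 0.18 = 0

\lambda_{2,3} = \frac{2.1 \pm \sqrt{5.13}}{2} \approx 2.1825,\ -0.0825
```

The three eigenvalues sum to 3, the trace of the matrix, and the negative value of about -0.0825 is the one that would be raised to a small positive number.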

The Cost Risk Correlation Algorithm

Correcting the User Input Matrix
Correlating the Uniform Random Number Draws
Optimizing the Applied Matrix

Correcting the User Input Matrix

• As a rule, correlation matrices must be positive semidefinite

- Positive semidefinite matrices have all non-negative eigenvalues

• Correlation matrices generated from data will necessarily be
positive semidefinite

• Unfortunately, when generating matrices based on SME judgment, this
condition may not be met

• To correct these matrices, an algorithm developed by Iman and Davenport [1]
was used

- The criterion for "closest matrix" that comes out of this algorithm is unknown to the
authors, but it is computationally efficient and relatively simple to implement

- Because the generation of the "closest viable correlation matrix" is so critical in finance,
there are several more robust algorithms available [2]

• The following slide will outline the algorithm used in the Cost-Risk Correlation
Module


[1] Iman, R. and Davenport, J. An Iterative Algorithm to Produce a Positive Definite Matrix from an
"Approximated Correlation Matrix" (With a Program User's Guide). Sandia National Laboratories for
the US DoE, June 1982

[2] Higham, N. Computing the Nearest Correlation Matrix - A Problem from Finance. IMA Journal of
Numerical Analysis. 2002

Correcting the User Input Matrix - Hurdles

• Two hurdles existed in implementing the algorithm

-  Excel  doesn't  have  a  function  that  finds  Eigenvalues  and  Eigenvectors  for  the 
correlation  matrices 

-  Excel  doesn't  have  a  function  to  compute  the  Cholesky  Decomposition  matrix 

•  Research  was  conducted  and  algorithms  (and  the  associated  VBA  source 
code)  that  conquered  both  hurdles  were  found 

- Both were part of the MATRIX and LINEAR ALGEBRA Package For EXCEL
developed  by  The  Foxes  team  in  Italy 

-  The  Cholesky  Decomposition,  Eigenvalues  and  Eigenvectors  functions  were  taken 
from  this  package  and  added  into  the  tool 
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Correcting  the  User  I  nput  Matrix  -  Algorithm 


NOHTHROP  GRUMMAN 


•  The algorithm iteratively adjusts the eigenvalues of user-inputted correlation matrices until the resulting matrix has all non-negative eigenvalues

•  During each iteration of the algorithm, there are two adjustments

1.  Adjustment of the negative eigenvalues to small, positive values

2.  Adjustment of the first adjusted matrix's diagonal entries to values of 1

•  Once the adjusted matrix (#2) is found to have all non-negative eigenvalues, the algorithm has found its solution


This  algorithm  is 
iterated  until  the 
Adjusted  matrix  is 
positive  definite 
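
The two adjustments above can be sketched in a few lines of Python. This is only an illustration of the iteration as the slide describes it, not the VBA code used in the tool; the function name, the `floor` value used for "small, positive" eigenvalues, and the iteration cap are assumptions made for this example, and NumPy stands in for the Excel eigenvalue routines.

```python
import numpy as np

def nearest_viable_correlation(matrix, floor=1e-4, max_iter=200):
    """Iteratively repair a symmetric, unit-diagonal matrix until it is positive definite.

    `floor` (the "small, positive value") and `max_iter` are illustrative choices.
    """
    a = np.array(matrix, dtype=float)
    for _ in range(max_iter):
        eigvals, eigvecs = np.linalg.eigh(a)
        if eigvals.min() > 0:
            return a  # every eigenvalue is positive: solution found
        # Adjustment 1: move non-positive eigenvalues to a small positive value.
        clipped = np.where(eigvals > 0, eigvals, floor)
        a = eigvecs @ np.diag(clipped) @ eigvecs.T
        a = (a + a.T) / 2.0  # remove floating-point asymmetry
        # Adjustment 2: reset the diagonal entries to 1.
        np.fill_diagonal(a, 1.0)
    raise RuntimeError("no positive definite matrix found within max_iter iterations")

# An inconsistent user input: the 0.1 entry is too low given the two 0.9 entries.
user_input = [[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.9],
              [0.1, 0.9, 1.0]]
adjusted = nearest_viable_correlation(user_input)
print(np.round(adjusted, 4))
```

Resetting the diagonal can push an eigenvalue back below zero, which is why the two adjustments are repeated rather than applied once.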


[Figure: Adjusted Matrix #2 from the worked example]

Correcting the User Input Matrix - Other Complications


•  Although the matrix produced by the algorithm on the preceding slides is a consistent correlation matrix, it may or may not be feasible, depending on the random variables being correlated

-  At least if the marginal distributions are to be preserved

•  The best way to illustrate this is to examine the maximum possible correlation between two Bernoulli risks

-  As shown below, unless the probabilities of the two risks are equal, the maximum possible correlation between them is less than 1

•  The final step in correcting the User Input Matrix is to adjust the matrix so that all correlations are feasible based on the distributions being correlated


Although  this  example  seems  odd, 
this  is  an  efficient  way  of  inducing 
conditional  probabilities  between 
Bernoulli  random  variables 


Because the only case in which XY ≠ 0 is when both risks occur, E(XY) simplifies to Cf1 × Cf2 times the probability that both risks occur. The highest this probability can possibly be is the minimum of the two probabilities.


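Joint outcomes of the two risks at the maximum correlation (from the diagram):

(X, Y)          Probability
(0, 0)          1 - Max(Pf1, Pf2)
(Cf1, 0)        Pf1 - Min(Pf1, Pf2)
(0, Cf2)        Pf2 - Min(Pf1, Pf2)
(Cf1, Cf2)      Min(Pf1, Pf2)

$$\rho_{X,Y} = \frac{E(XY) - E(X)\,E(Y)}{\sqrt{E(X^2) - E(X)^2}\;\sqrt{E(Y^2) - E(Y)^2}}$$

$$\rho(\max)_{X,Y} = \frac{\min(Pf_1, Pf_2)\,Cf_1\,Cf_2 - (Pf_1\,Cf_1)(Pf_2\,Cf_2)}{\sqrt{Cf_1^2\,Pf_1(1 - Pf_1)}\;\sqrt{Cf_2^2\,Pf_2(1 - Pf_2)}} = \frac{\min(Pf_1, Pf_2) - Pf_1\,Pf_2}{\sqrt{Pf_1(1 - Pf_1)}\;\sqrt{Pf_2(1 - Pf_2)}}$$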

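The bound on this slide is easy to evaluate directly. The sketch below assumes two Bernoulli risks with positive cost impacts, so the cost factors cancel and only the two probabilities matter; the function name is hypothetical.

```python
import math

def max_bernoulli_correlation(pf1, pf2):
    """Largest feasible Pearson correlation between two Bernoulli risks.

    pf1 and pf2 are the probabilities of occurrence, each strictly between 0 and 1.
    """
    if not (0.0 < pf1 < 1.0 and 0.0 < pf2 < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    # Both risks can occur together with probability at most min(pf1, pf2).
    joint_max = min(pf1, pf2)
    return (joint_max - pf1 * pf2) / math.sqrt(pf1 * (1 - pf1) * pf2 * (1 - pf2))

print(max_bernoulli_correlation(0.5, 0.5))  # equal probabilities: the bound is 1
print(max_bernoulli_correlation(0.9, 0.1))  # unequal probabilities: the bound is below 1
```

Any user-requested correlation above this bound would have to be reduced to the bound before the marginal probabilities can be preserved.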

Correlating Random Variables:
An Introduction to the Lurie-Goldberg Method^


•  The  only  method  the  authors  were  aware  of  for  inducing 
Pearson's  correlation  between  input  random  variables  is  the 
Lurie-Goldberg  Algorithm 

-  The  Lurie-Goldberg  Algorithm  aims  to  find  an  applied  correlation  matrix 
such  that  the  input  correlation  and  output  correlation  are  as  close  as 
possible 


•  Find matrix L such that the series of transformations

   indep. normal --(L)--> mult. normal --(Φ)--> uniform --(F⁻¹)--> desired

   leads to random variables with the desired correlations and marginal distributions

   L: Cholesky factor; transforms independent normals to correlated normals
   Φ: normal c.d.f.; transforms correlated normals to correlated uniforms
   F⁻¹: transforms correlated uniforms to correlated random variables with desired marginal distributions F


-  Unfortunately,  the  authors  could  not  find  a  method  for  finding  this 
optimal  matrix  (L...  referenced  as  A'  in  this  paper) 

-  One  obvious  solution  is  to  optimize  the  matrix  by  examining  the  post¬ 
simulation  correlations 

•  Given  the  computing  power  needed  to  complete  each  simulation,  this 
could  be  a  time  consuming  endeavor 

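The transformation chain on this slide can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the two example marginals (an exponential and a Bernoulli risk), the applied correlation of 0.75, and the use of NumPy are all assumptions made for the example.

```python
import math
import numpy as np

def lurie_goldberg_sample(applied_corr, inverse_cdfs, n_draws, seed=0):
    """Draw samples via indep. normal -> mult. normal -> uniform -> desired marginals."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(applied_corr, dtype=float))  # L with A = L L^T
    indep_normals = rng.standard_normal((n_draws, len(inverse_cdfs)))
    mult_normals = indep_normals @ chol.T  # each row is L times a vector of N(0,1) draws
    # Standard normal c.d.f. (Phi) applied element by element.
    std_normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
    uniforms = std_normal_cdf(mult_normals)  # correlated Uniform(0,1) draws
    # Inverse c.d.f. of each desired marginal, applied column by column.
    columns = [inv(uniforms[:, j]) for j, inv in enumerate(inverse_cdfs)]
    return np.column_stack(columns)

applied = [[1.0, 0.75],
           [0.75, 1.0]]
inverse_cdfs = [
    lambda u: -np.log1p(-u),            # exponential marginal with mean 1
    lambda u: (u > 0.9).astype(float),  # Bernoulli risk with probability 0.1
]
samples = lurie_goldberg_sample(applied, inverse_cdfs, n_draws=20000, seed=1)
outcome_corr = np.corrcoef(samples, rowvar=False)[0, 1]
print(f"applied correlation 0.75, outcome correlation {outcome_corr:.3f}")
```

Running a simulation like this for every candidate matrix is the brute-force optimization the slide describes: the outcome correlation is only observable after the draws are made.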
^ Goldberg, Matthew S. and Lurie, Philip M. Correlating Random Variables. 32nd DoDCAS, Williamsburg, VA, February 1999


Correlating  the  Uniform  Draws: 
The  Lurie-Goldberg  Method 




•  Once a viable correlation matrix exists, correlated Uniform (0,1) random numbers must be generated, which in turn are used to generate the desired random variables

•  To accomplish this, the Cholesky Decomposition Matrix of the adjusted matrix is found

-  L is the Cholesky Decomposition of A iff L is a lower triangular matrix such that:

   A = LL^T

•  After the Cholesky Decomposition Matrix is found, the algorithm at right is run to produce correlated Uniform (0,1) random numbers

•  These  random  numbers,  vice  the 
originals,  are  used  in  the  risk 
model  to  generate  points  off  of 
distributions 

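The example values on this slide can be retraced with the Python standard library alone. The sketch below is an assumption-laden illustration, not the tool's VBA: the hand-written `cholesky` helper and all variable names are hypothetical, and the inputs are the 4-decimal adjusted matrix and the three U(0,1) draws shown on the slide.

```python
import math
from statistics import NormalDist

def cholesky(a):
    """Return lower-triangular L with a = L L^T for a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - partial)
            else:
                low[i][j] = (a[i][j] - partial) / low[j][j]
    return low

adjusted = [[1.0, 0.7522, 0.1322],
            [0.7522, 1.0, 0.7522],
            [0.1322, 0.7522, 1.0]]
uniform_draws = [0.262718533339898, 0.796166602021694, 0.153625416321097]

std_normal = NormalDist()
chol = cholesky(adjusted)
# Inverse CDF technique: U(0,1) draws become independent N(0,1) draws.
independent_normals = [std_normal.inv_cdf(u) for u in uniform_draws]
# Multiply N(0,1) by Cholesky: independent normals become correlated normals.
correlated_normals = [sum(chol[i][k] * independent_normals[k] for k in range(3))
                      for i in range(3)]
# The normal c.d.f. maps the correlated normals back to correlated U(0,1) draws.
correlated_uniforms = [std_normal.cdf(z) for z in correlated_normals]
print(correlated_uniforms)
```

Because the adjusted matrix is nearly singular, the last diagonal entry of the factor is sensitive to rounding of the inputs and may differ slightly from the value printed on the slide.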

Worked example (adjusted matrix -> Cholesky decomposition; U(0,1) random draws -> inverse CDF technique -> random N(0,1); then multiply N(0,1) by Cholesky):

Adjusted Matrix

1.0000   0.7522   0.1322
0.7522   1.0000   0.7522
0.1322   0.7522   1.0000

Cholesky Decomposition

1.0000   0.0000   0.0000
0.7522   0.6589   0.0000
0.1322   0.9907   0.0321

U(0,1) Random Draws        Random N(0,1) (by the inverse CDF technique)

0.26271853333989800        -0.63498673467686800
0.79616660202169400         0.82800654029771300
0.15362541632109700        -1.02100761130346000

Multiply N(0,1) by Cholesky: Cholesky Decomposition × Random N(0,1)

Optimizing the Applied Correlation Matrix


•  Non-linear  transformations  are  used  to  correlate  random  variables  in  the  model 

-  Because of this, the outcome correlation may differ from the intended correlation

•  The biggest hurdle this module faced was correcting this discrepancy

•  Northrop Grumman has developed a method that can find the outcome correlation matrix for any applied correlation matrix before the simulation is run

-  In other words, the algorithm can determine ρ_Output given ρ_Applied

-  The applied correlation matrix can then be optimized so that the outcome correlation matrix is equal to the adjusted correlation matrix

•  Additionally, it follows from mathematical proofs that the optimal applied correlation matrix will induce the desired correlation

-  This implies that any variation in ρ in the simulation runs is due solely to Monte Carlo sampling error

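The slides do not spell out how the outcome correlation is computed without simulation, so the following is only an illustrative sketch for one special case, not Northrop Grumman's algorithm. It assumes two Bernoulli risks, for which the outcome correlation under the Lurie-Goldberg transformations follows from a bivariate normal probability, and it uses Simpson's rule and bisection as stand-in numerical methods. All function names and tolerances are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def joint_occurrence(p1, p2, rho, steps=2000):
    """P(both risks occur) when the underlying normals have correlation rho (|rho| < 1)."""
    a = _STD_NORMAL.inv_cdf(1.0 - p1)  # risk 1 occurs when its normal exceeds a
    b = _STD_NORMAL.inv_cdf(1.0 - p2)  # risk 2 occurs when its normal exceeds b
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(x):
        # Density of normal 1 at x times P(normal 2 > b | normal 1 = x).
        return _STD_NORMAL.pdf(x) * (1.0 - _STD_NORMAL.cdf((b - rho * x) / scale))

    upper = max(8.0, a + 8.0)  # the normal density is negligible beyond this point
    h = (upper - a) / steps    # Simpson's rule; steps must be even
    total = integrand(a) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(a + i * h)
    return total * h / 3.0

def output_correlation(p1, p2, rho_applied):
    """Pearson correlation of the two Bernoulli risks, found without simulation."""
    covariance = joint_occurrence(p1, p2, rho_applied) - p1 * p2
    return covariance / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))

def find_applied_correlation(p1, p2, target, tol=1e-8):
    """Bisect on the applied correlation until the outcome correlation hits the target."""
    lo, hi = -0.999, 0.999
    if not output_correlation(p1, p2, lo) <= target <= output_correlation(p1, p2, hi):
        raise ValueError("target correlation is not feasible for these marginals")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if output_correlation(p1, p2, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(find_applied_correlation(0.9, 0.25, 0.15))
```

Bisection is enough here because the outcome correlation increases monotonically with the applied correlation for a fixed pair of marginals; a full matrix would need this search for every pair of variables.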

Find:

Applied Correlation Matrix

1.0000   0.7915   0.2263
0.7915   1.0000   0.7744
0.2263   0.7744   1.0000

Such that after the Lurie-Goldberg method takes place:

Adjusted Correlation Matrix

1.0000   0.7522   0.1322
0.7522   1.0000   0.7522
0.1322   0.7522   1.0000

Outcome Correlation Matrix

1.0000   0.7522   0.1322
0.7522   1.0000   0.7522
0.1322   0.7522   1.0000

Optimizing the Applied Correlation Matrix


•  The  algorithm  developed  by  Northrop  Grumman  finds  the  optimal  applied  correlation 
matrix  given: 

1.  The  parent  distributions  being  correlated 

2.  The  adjusted  correlation  matrix 

•  The algorithm runs prior to the simulation being executed and, once performed, only needs to be re-run as variables are added or changed

-  And in those cases, only for the new/modified distributions

•  Although  the  algorithm  was  originally  developed  for  cost  risk  analysis,  it  has 
applications  wherever  a  user  needs  to  account  for  correlation  between  independent 
random  variables 

For  example:  the  modeling  of  mutual  fund  performance  given  it  is  made  up  of  a  group  of 
correlated  stocks  and  bonds 

•  In  fact,  the  algorithm's  first  use  is  in  the  modeling  of  conditional  probabilities 
between  Bernoulli  independent  random  variables 

The  customer  needed  an  efficient  way  to  model  the  conditional  probabilities  they  found  between 
parameters  in  their  data  while  preserving  the  marginal  probabilities 

It  can  be  shown  using  the  same  general  methodology  on  slide  14  that  Pearson's  correlation 
between  two  Bernoulli  random  variables  equates  to  a  conditional  probability  between  them 
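
The link between correlation and conditional probability can be written out in two lines. The derivation below is a sketch under the assumption that X and Y are indicator (0/1) variables with P(X = 1) = p_1 and P(Y = 1) = p_2; these symbols are chosen for illustration and do not appear on the slides. Scaling each indicator by a positive cost impact leaves the correlation unchanged.

```latex
\begin{align*}
\rho_{X,Y} &= \frac{P(X = 1,\, Y = 1) - p_1 p_2}{\sqrt{p_1 (1 - p_1)\, p_2 (1 - p_2)}} \\
P(Y = 1 \mid X = 1) &= \frac{P(X = 1,\, Y = 1)}{p_1}
  = p_2 + \rho_{X,Y}\, \frac{\sqrt{p_1 (1 - p_1)\, p_2 (1 - p_2)}}{p_1}
\end{align*}
```

With the marginal probabilities fixed, each value of the correlation therefore corresponds to exactly one conditional probability, and vice versa.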

Application to the CG(X) Program Risk Assessment


Correlation Data


•  One of the most difficult steps in the risk assessment process is determining the correlation between the elements

•  In this assessment, correlation is currently being measured using the relationship between the SWBS hours for three classes of surface combatants

•  Just recently, data was obtained showing estimates vs. actuals, by SWBS, for various ships

-  The  plan  is  to  switch  to  correlations  using  this  data  once  the  analysis  is 
complete 

•  Once uncertainty was evaluated for each lower level SWBS, correlation was applied between them to produce the top level risk adjusted estimate

Estimating Variance (diagram)


•  This simplified example shows only the 100 and 700 SWBS

-  100 may have a lower level of uncertainty around its estimate than 700


•  Using  the  correlation  algorithm, 
accurate  distributions  can  be 
generated  for  the  lower  level 
SWBSs  that,  when  added  together, 
still  produce  the  known  historical 
distribution 

-  This  allows  decision  makers  to  see 
what  areas  of  the  ship  contain  the 
greatest  variance 

-  It also allows risk to be applied at the 1-digit-level (see next slides)

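For the two-element example, the role of correlation in reproducing the whole-ship distribution can be seen from the standard variance identity for a sum. The symbols below (means, standard deviations, and the correlation between the 100 and 700 SWBS costs) are introduced for illustration only and are not taken from the slides.

```latex
\begin{align*}
\sigma_{\text{ship}}^2 &= \sigma_{100}^2 + \sigma_{700}^2 + 2\,\rho_{100,700}\,\sigma_{100}\,\sigma_{700} \\
CV_{\text{ship}} &= \frac{\sigma_{\text{ship}}}{\mu_{100} + \mu_{700}}
\end{align*}
```

If the correlation term is ignored, the summed lower-level distributions come out narrower than the whole-ship distribution whenever the true correlation is positive.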

[Figure: cost distributions for the 100 SWBS, the 700 SWBS, and the Whole Ship Cost (100, 700). Callout: the bigger ratio of new to repeat work in the 700 SWBS is reflected in its larger CV (wider curve)]


Schedule/Technical Risks & Opportunities


•  The next step in the risk assessment was adding in schedule and technical risks

-  Opportunities are just risks with a negative cost impact (cost is decreased)

-  From this point on, risks and opportunities will be referred to simply as risks

•  Technical  experts  involved  in  CG(X)  from  across  the  corporation  were 
interviewed  to  produce  the  schedule/technical  risks  associated  with 
their  area  of  the  ship 

•  The  following  information  was  collected: 

-  Description  of  the  risk 

-  Probability  of  occurrence 

-  Description  of  the  impact 

•  This  is  the  consequence  of  the  risk  occurring 

-  Mitigation  plans 

•  Description  of  the  mitigation  plan 

•  Cost  of  the  mitigation  plan 

•  Probability  and  impact  if  the  risk  is  mitigated 

•  Whether  or  not  the  mitigation  plan  is  included  in  the  cost  baseline 

-  Other  areas  of  the  ship  affected  if  the  risk  were  to  occur 

•  If a schedule/technical risk increased the probability of occurrence for another risk, this was captured using the previously described correlation algorithm

Schedule/Technical Risk Template


Risk  ID: 

An  ID  used  to  identify  the  risk.  Label  Sequentially 

Risk  Description: 

The  risk  description  is  a  basic  description  of  what  the  risk  is.  In  particular, 
what  could  go  wrong. 

Probability  of  Occurrence: 

The  probability  that  the  risk  will  occur. 

Impact  Description: 

The  impact  description  is  all  the  information  that  would  be  needed  from  the 
SME  in  order  to  estimate  the  cost  impact  of  the  risk  independently. 

Wherever possible, please include schedule impacts as well.

Mitigation Plan(s):

The  mitigation  plan(s)  are  all  activities  that  would  lower  the  expected  value  of 
the  risk.  These  activities  do  not  have  to  completely  eliminate  the  risks,  they 
could  just  lower  either  the  probability  of  occurrence  or  cost  impact. 

Information  to  be  included: 

1.  Cost of Mitigation Plan (both schedule and $)

2.  Effect the mitigation plan has on the risk (what is the decrease in probability or cost/schedule impact)

Other  Areas  Affected: 

Are there any other areas of the ship that could be impacted if this risk were to occur (or if the mitigation plans are put into motion)? If so, describe the impact and the area it would affect. Then, interview the owner of that area to determine if there are any more residual impacts not foreseen originally.

Schedule/Technical Risk Modeling


•  Once the risks were collected, they were input into the model

•  For  risks  with  mitigation  strategies,  whether  or  not  the  mitigation 
strategy  is  implemented  was  selected  using  a  drop-down  menu 

-  Mitigated  risks  (whose  cost  of  mitigation  is  not  included  in  the  cost  baseline) 
will  add  cost  to  the  baseline  cost 

-  Mitigated  risks  will  use  mitigated  probabilities  and  consequences 

•  Each risk is assigned to a 1-digit-level SWBS

-  This, along with the fact that cost estimating variability is also assessed at the 1-digit-level, allows cost distributions to be produced accurately at the 1-digit-level

•  Risks  can  also  be  inputted  as  continuous  risks  (as  appropriate): 

-  Triangular  Distributions 

-  Normal  Distributions 

-  Log-Normal  Distributions 

-  All  of  these  distributions  can  have  probabilities  assigned  to  them  as  well 
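
The mitigation rules on this slide can be sketched for a single discrete risk as follows. The class and field names are hypothetical, the figures are those of Sample Risk 1 in the sample risk table, and the assumption that its mitigation cost is not already in the cost baseline is made only for this example. The actual model draws each risk in the simulation; this sketch compares expected values only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DiscreteRisk:
    probability: float                       # probability of occurrence
    cost_impact: float                       # cost impact if the risk occurs
    mitigated_probability: Optional[float] = None
    mitigated_cost_impact: Optional[float] = None
    cost_of_mitigation: float = 0.0
    mitigation_in_baseline: bool = False     # True if the baseline already pays for it

def expected_cost(risk: DiscreteRisk, mitigation_implemented: bool) -> float:
    """Expected cost added by one risk, with or without its mitigation plan."""
    if not mitigation_implemented:
        return risk.probability * risk.cost_impact
    if risk.mitigated_probability is None or risk.mitigated_cost_impact is None:
        raise ValueError("risk has no mitigation plan")
    # Mitigated risks use mitigated probabilities and consequences, and add the
    # cost of mitigation unless the baseline already includes it.
    added = 0.0 if risk.mitigation_in_baseline else risk.cost_of_mitigation
    return risk.mitigated_probability * risk.mitigated_cost_impact + added

sample_risk_1 = DiscreteRisk(
    probability=0.90, cost_impact=25_000_000,
    mitigated_probability=0.30, mitigated_cost_impact=10_000_000,
    cost_of_mitigation=7_500_000,
)
print(expected_cost(sample_risk_1, mitigation_implemented=False))
print(expected_cost(sample_risk_1, mitigation_implemented=True))
```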

Schedule/Technical Risks

Risk ID | SWBS | Description   | Probability of Occurrence | Cost Impact  | Mitigation Plan Description | Mitigated Probability | Mitigated Cost Impact | Cost of Mitigation | Mitigation Plan Implemented?
1       | 000  | Sample Risk 1 | 90%                       | $25,000,000  | Mitigation Plan 1           | 30%                   | $10,000,000           | $7,500,000         | Yes
2       | 200  | Sample Risk 2 | 52%                       |              |                             |                       |                       |                    | No
3       | 300  | Sample Risk 3 | 75%                       |              |                             |                       |                       |                    | No
4       | 400  | Sample Risk 4 | 100%                      |              |                             |                       |                       |                    | No
5       | 500  | Sample Risk 5 | 10%                       | $100,000,000 | Mitigation Plan 5           | 1%                    | $50,000,000           | $10,000,000        | Yes
6       | 600  | Sample Risk 6 | 25%                       | $13,000,000  |                             |                       |                       |                    | No
7       | 700  | Sample Risk 7 | 90%                       | $9,000,000   |                             |                       |                       |                    | No
8       | 800  | Sample Risk 8 | 100%                      |              |                             |                       |                       |                    | No
9       | 900  | Sample Risk 9 | 100%                      |              |                             |                       |                       |                    | No

Model Also Accepts Triangular, Normal and Lognormal Risk Distributions

Results


•  Several sets of results are produced automatically by the simulation when the “Run Simulation” button is hit

•  CG(X) Risk Adjusted S-Curve

-  Shows the whole-ship cost distribution with the point estimate and its confidence on the graph

•  CG(X) Risk Adjusted Estimate by 1-digit-level SWBS

-  Shows Upside (20th Percentile), Probable (50th Percentile) and Downside (80th Percentile) by 1-digit-level SWBS

•  CG(X) Risk by SWBS

-  Shows upside, probable and downside risk $'s by SWBS

-  These are the $'s due entirely to the risks, not estimating variation

CG(X) Risk Adjusted S-Curve

[Figure: CG(X) cost S-curve marking the Point Estimate, Upside (20th), Probable (50th) and Downside (80th), with the CV noted]

CG(X) Risk Adjusted Estimate

SWBS  | Description               | Upside | Probable | Downside
000   | Administration            |        |          |
100   | Hull                      |        |          |
200   | Propulsion                |        |          |
300   | Electric Plant            |        |          |
400   | Electronics Systems       |        |          |
500   | Auxiliary Systems         |        |          |
600   | Outfit & Furnishings      |        |          |
700   | Weapons                   |        |          |
800   | Integration & Engineering |        |          |
900   | Ship Assembly & Support   |        |          |
Total |                           |        |          |

CG(X) Risk by SWBS

SWBS  | Description               | Upside | Probable | Downside
000   | Administration            |        |          |
100   | Hull                      |        |          |
200   | Propulsion                |        |          |
300   | Electric Plant            |        |          |
400   | Electronics Systems       |        |          |
500   | Auxiliary Systems         |        |          |
600   | Outfit & Furnishings      |        |          |
700   | Weapons                   |        |          |
800   | Integration & Engineering |        |          |
900   | Ship Assembly & Support   |        |          |
Total |                           |        |          |

Conclusion

•  The previously discussed method is an attempt at producing a risk adjusted estimate for the CG(X) program that is also accurate at the SWBS level

•  This analysis would not have been possible were it not for the creation of the cost/risk correlation module


